Skip to content

Call module_close() after freeing the string name to avoid a segfault#33

Closed
tyler274 wants to merge 1 commit into
latchset:mainfrom
tyler274:patch-1
Closed

Call module_close() after freeing the string name to avoid a segfault#33
tyler274 wants to merge 1 commit into
latchset:mainfrom
tyler274:patch-1

Conversation

@tyler274

@tyler274 tyler274 commented Jun 17, 2026

Copy link
Copy Markdown

In the existing order there is a valgrind confirmed segfault when we free record->filename after the module was closed.

Will test this some more too! Tests pass locally.

In the existing order there is a valgrind confirmed segfault when we free `record->filename` after the module was closed.

Signed-off-by: tyler <tyler274port@gmail.com>
@tyler274

tyler274 commented Jun 17, 2026

Copy link
Copy Markdown
Author

I tested this change after packaging the library for NixOS with a direnv enabled build environment. Bumped the version over there but can cherry pick the changes into this branch for 0.3.3.

Ill try to pull this into gssproxy to see if the C version with valgrind still reports a segfault after I wake up.

@simo5 simo5 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does absolutely nothing (except adding unwanted spaces), please show a stack trace with the supposed segfault or a provide a reproducer

@tyler274

tyler274 commented Jun 17, 2026

Copy link
Copy Markdown
Author

This change does absolutely nothing (except adding unwanted spaces), please show a stack trace with the supposed segfault or a provide a reproducer

Will do. This manifested and is mentioned here gssapi/gssproxy#14

The segfault goes away when the function is disabled during configuration.

@tyler274

tyler274 commented Jun 17, 2026

Copy link
Copy Markdown
Author

@simo5 As requested:

If I leave the function enabled at all in gssproxy it errors.

=== Binary: /nix/store/rzikw3qzjasaam0cicns5j8n3vzmg6kc-gssproxy-0.9.2/bin/gssproxy ===
=== Tmpdir: /tmp/nix-shell.DWFgWr/tmp.edelSCfJTv ===
valgrind PID: 2422135
Waiting for readiness...
--- daemon stderr ---
--- sending SIGTERM ---
--- exit code: 0 ---

=== Valgrind memcheck output ===
==2422135== Memcheck, a memory error detector
==2422135== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==2422135== Using Valgrind-3.26.0 and LibVEX; rerun with -h for copyright info
==2422135== Command: /nix/store/rzikw3qzjasaam0cicns5j8n3vzmg6kc-gssproxy-0.9.2/bin/gssproxy -i -s /tmp/nix-shell.DWFgWr/tmp.edelSCfJTv/gp.sock -c /tmp/nix-shell.DWFgWr/tmp.edelSCfJTv/gp.conf
==2422135== Parent PID: 2421859
==2422135== 
==2422135== Invalid free() / delete / delete[] / realloc()
==2422135==    at 0x486FC0B: free (in /nix/store/z3k1ih7g5njqnbhns0anx8mw6zqjs1z3-valgrind-3.26.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2422135==    by 0x4BF0231: verto_cleanup (in /nix/store/5cbjxz04crbjnawysjx5q717gvbsx7k8-krb5-1.22.1-lib/lib/libverto.so.0.0)
==2422135==    by 0x4007741: main (in /nix/store/rzikw3qzjasaam0cicns5j8n3vzmg6kc-gssproxy-0.9.2/bin/gssproxy)
==2422135==  Address 0x4bf4021 is in a r-- mapped file /nix/store/5cbjxz04crbjnawysjx5q717gvbsx7k8-krb5-1.22.1-lib/lib/libverto.so.0.0 segment
==2422135== 
==2422135== Invalid free() / delete / delete[] / realloc()
==2422135==    at 0x486FC0B: free (in /nix/store/z3k1ih7g5njqnbhns0anx8mw6zqjs1z3-valgrind-3.26.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2422135==    by 0x4BF028D: verto_cleanup (in /nix/store/5cbjxz04crbjnawysjx5q717gvbsx7k8-krb5-1.22.1-lib/lib/libverto.so.0.0)
==2422135==    by 0x4007741: main (in /nix/store/rzikw3qzjasaam0cicns5j8n3vzmg6kc-gssproxy-0.9.2/bin/gssproxy)
==2422135==  Address 0x4bf7020 is 0 bytes inside data symbol "builtin_record"
==2422135== 
==2422135== 
==2422135== HEAP SUMMARY:
==2422135==     in use at exit: 688 bytes in 15 blocks
==2422135==   total heap usage: 2,906 allocs, 2,893 frees, 281,483 bytes allocated
==2422135== 
==2422135== For a detailed leak analysis, rerun with: --leak-check=full
==2422135== 
==2422135== For lists of detected and suppressed errors, rerun with: -s
==2422135== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)```

@tyler274

Copy link
Copy Markdown
Author

Reproducer is taking a bit longer, as the script I have expects the reproducible build environment I set up in the gssproxy for NixOS.

@tyler274

tyler274 commented Jun 17, 2026

Copy link
Copy Markdown
Author

Here is my kernel hardening configuration in the meantime:

https://github.com/tyler274/nixos/blob/main/modules/nixos/hardening.nix

@tyler274

tyler274 commented Jun 18, 2026

Copy link
Copy Markdown
Author

Reproducer script here. Drop it in a fresh checkout of gssproxy and it should be able to setup the build environment and reproduce.

#!/usr/bin/env bash
#
# Reproducer for the gssproxy verto_cleanup double-free.
#
# When gssproxy is linked against krb5's embedded (BUILTIN_MODULE) libverto,
# calling verto_cleanup() on exit attempts to free:
#   - a static struct  ("builtin_record" in libverto.so)
#   - a string literal (the empty module-name baked into the .so read-only segment)
# Neither is heap memory, so the free() aborts the daemon on every clean
# shutdown ("free(): invalid pointer").
#
# This script builds the C gssproxy from this checkout and runs it under
# valgrind, then sends SIGTERM to trigger the exit path. Everything it needs
# (a pinned nixpkgs, valgrind, and an inline gssproxy build) is fetched/defined
# by Nix here, so it is fully self-contained: it does NOT use the repo's nix/
# packaging and runs anywhere Nix is installed -- NixOS, Fedora, Debian, etc.
# The only prerequisite is a working `nix` (single- or multi-user); no channels,
# NIX_PATH, or flake configuration are required.
#
# Usage (from anywhere):
#   ./reproduce-verto-cleanup.sh
#
# Expected output: two "Invalid free()" reports from valgrind, both with the
# stack  free -> verto_cleanup (libverto.so) -> main (gssproxy), and
# "ERROR SUMMARY: 2 errors from 2 contexts".

set -euo pipefail

# Resolve this script's own directory so the gssproxy sources are found
# regardless of the caller's working directory (this is the repo root).
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)

if ! command -v nix-shell >/dev/null 2>&1; then
  echo "error: 'nix-shell' not found. Install Nix first: https://nixos.org/download" >&2
  exit 127
fi

# nixpkgs pinned to the exact revision in flake.lock, fetched by commit so no
# channel or <nixpkgs> entry is needed. The sha256 is the flake's locked
# narHash (the unpacked-tree hash fetchTarball also uses), so this is fully
# reproducible and offline-cacheable.
NIXPKGS_REV="9ae611a455b90cf061d8f332b977e387bda8e1ca"
NIXPKGS_SHA="sha256-md8WlXOlfnIeHeOScMTTHFyf2d6iaTwPl2apR5EQ3P4="

# A shell environment providing valgrind plus a gssproxy built inline from THIS
# checkout's sources. The derivation is defined here (not imported from nix/),
# so the script works against any checkout -- including ones with no Nix infra.
# It mirrors the autotools build: autoreconf, configure (with a DocBook catalog
# that configure mandates for the man pages), make, then install just the daemon
# binary. krb5 pulls in its embedded BUILTIN_MODULE libverto, which is exactly
# the configuration that triggers the verto_cleanup() invalid free.
read -r -d '' NIX_EXPR <<EOF || true
let
  pkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/${NIXPKGS_REV}.tar.gz";
    sha256 = "${NIXPKGS_SHA}";
  }) { };

  gssproxy = pkgs.stdenv.mkDerivation {
    pname = "gssproxy-verto-repro";
    version = "0";

    src = pkgs.lib.cleanSource ${SCRIPT_DIR};

    nativeBuildInputs = with pkgs; [
      autoreconfHook
      pkg-config
      gettext
      libxslt
      libxml2
      docbook-xsl-nons
      docbook_xml_dtd_45
      findutils
    ];

    buildInputs = with pkgs; [
      krb5
      libverto
      ding-libs
      keyutils
      popt
      systemdLibs
    ];

    configureFlags = [ "--with-initscript=none" "--with-selinux=no" ];

    # configure (external/docbook.m4) requires xsltproc/xmllint/xmlcatalog and a
    # catalog that resolves the DocBook manpages stylesheet, so build a combined
    # one that delegates to the DTD and (non-namespaced) XSL catalogs offline.
    preConfigure = ''
      combinedCatalog="\$PWD/.nix-docbook-catalog.xml"
      xmlcatalog --noout --create "\$combinedCatalog"
      xmlcatalog --noout --add nextCatalog "" \\
        "\${pkgs.docbook_xml_dtd_45}/xml/dtd/docbook/catalog.xml" "\$combinedCatalog"
      xmlcatalog --noout --add nextCatalog "" \\
        "\${pkgs.docbook-xsl-nons}/share/xml/docbook-xsl-nons/catalog.xml" "\$combinedCatalog"
      configureFlagsArray+=("--with-xml-catalog-path=\$combinedCatalog")
    '';

    enableParallelBuilding = true;
    doCheck = false;
    # Keep symbols so valgrind prints a readable verto_cleanup/main backtrace.
    dontStrip = true;

    installPhase = ''
      runHook preInstall
      install -Dm755 gssproxy "\$out/bin/gssproxy"
      runHook postInstall
    '';
  };
in
pkgs.mkShell { packages = [ pkgs.valgrind gssproxy ]; }
EOF

# The actual test, kept in its own file so the single quotes in the readiness
# string don't fight with the outer shell quoting.
INNER=$(mktemp)
trap 'rm -f "$INNER"' EXIT
cat >"$INNER" <<'INNER_EOF'
set -euo pipefail

D=$(mktemp -d)
trap 'rm -rf "$D"' EXIT

printf '[service/test]\n  mechs = krb5\n  euid = 0\n' >"$D/gp.conf"

valgrind \
  --tool=memcheck \
  --leak-check=no \
  --num-callers=20 \
  --error-exitcode=1 \
  --log-file="$D/vg.log" \
  gssproxy -i -s "$D/gp.sock" -c "$D/gp.conf" \
  >"$D/out" 2>&1 &
PID=$!

# Wait up to 10s for the daemon's readiness line before signalling it.
for _ in $(seq 50); do
  grep -q "Initialization complete" "$D/out" 2>/dev/null && break
  sleep 0.2
done

kill -TERM "$PID" 2>/dev/null || true
wait "$PID" 2>/dev/null || true   # valgrind exits non-zero once it finds errors

cat "$D/vg.log"
INNER_EOF

exec nix-shell --expr "$NIX_EXPR" --run "bash '$INNER'"

@simo5

simo5 commented Jun 18, 2026

Copy link
Copy Markdown
Member

None of the developers here uses nixos, sorry but this is not a useful and minimal reproducer.
There is no need to provide a script, even instructions and/or a log that show the issue is fine.

@simo5

simo5 commented Jun 18, 2026

Copy link
Copy Markdown
Member

By the way, re-reading the valgrind trace in one of the comments I now know what's up, I'll submit a proper fix later.

@tyler274

tyler274 commented Jun 18, 2026

Copy link
Copy Markdown
Author

None of the developers here uses nixos, sorry but this is not a useful and minimal reproducer.
There is no need to provide a script, even instructions and/or a log that show the issue is fine.

That is why I tried to make the script compatible with Debian and Fedora. Nix is a package manager that works on multiple operating systems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants